Goto

Collaborating Authors

 voronoi cell


Efficient Algorithms for Robust and Partial Semi-Discrete Optimal Transport

Neural Information Processing Systems

The sensitivity of optimal transport (OT) to noise has motivated the study of robust variants. In this paper, we study two such formulations of semi-discrete OT in Rd: (i) the ฮฑ-optimal partial transport, which minimizes the cost of transporting a mass of ฮฑ; and (ii) the ฮป-robust optimal transport, which regularizes the OT problem using the total variation (TV) distance. First, we provide a novel characterization of the optimal solutions in these settings, showing they can be represented as a restricted Laguerre diagram. Second, we exploit this characterization to establish a strong algorithmic connection between the two problems, showing that any solver for one can be adapted to solve the other with comparable precision. Third, we overcome key challenges posed in extending the cost-scaling paradigm to compute these variants of OT and present an algorithm that computes the exact solution up to log(1/ฮต) bits of precision in nO(d) log(1/ฮต) time, where nis the support size of the discrete distribution.


ACombinatorialAlgorithmfortheSemi-Discrete OptimalTransportProblem

Neural Information Processing Systems

In the semi-discrete2-Wasserstein problem, we wish to compute the cheapest way to transport all the mass from a continuous distribution ยต to a discrete distributionฮฝ in Rd for d 1, where the cost of transporting unitmassbetween pointsaandbisd(a,b)= a b 2. When both distributions are discrete, a simple combinatorial framework has been used to find the exact solution (see e.g.


Dendrograms of Mixing Measures for Softmax-Gated Gaussian Mixture of Experts: Consistency without Model Sweeps

arXiv.org Machine Learning

We develop a unified statistical framework for softmax-gated Gaussian mixture of experts (SGMoE) that addresses three long-standing obstacles in parameter estimation and model selection: (i) non-identifiability of gating parameters up to common translations, (ii) intrinsic gate-expert interactions that induce coupled differential relations in the likelihood, and (iii) the tight numerator-denominator coupling in the softmax-induced conditional density. Our approach introduces Voronoi-type loss functions aligned with the gate-partition geometry and establishes finite-sample convergence rates for the maximum likelihood estimator (MLE). In over-specified models, we reveal a link between the MLE's convergence rate and the solvability of an associated system of polynomial equations characterizing near-nonidentifiable directions. For model selection, we adapt dendrograms of mixing measures to SGMoE, yielding a consistent, sweep-free selector of the number of experts that attains pointwise-optimal parameter rates under overfitting while avoiding multi-size training. Simulations on synthetic data corroborate the theory, accurately recovering the expert count and achieving the predicted rates for parameter estimation while closely approximating the regression function. Under model misspecification (e.g., $ฮต$-contamination), the dendrogram selection criterion is robust, recovering the true number of mixture components, while the Akaike information criterion, the Bayesian information criterion, and the integrated completed likelihood tend to overselect as sample size grows. On a maize proteomics dataset of drought-responsive traits, our dendrogram-guided SGMoE selects two experts, exposes a clear mixing-measure hierarchy, stabilizes the likelihood early, and yields interpretable genotype-phenotype maps, outperforming standard criteria without multi-size training.


Cellular Learning: Scattered Data Regression in High Dimensions via Voronoi Cells

arXiv.org Artificial Intelligence

I present a regression algorithm that provides a continuous, piecewise-smooth function approximating scattered data. It is based on composing and blending linear functions over Voronoi cells, and it scales to high dimensions. The algorithm infers Voronoi cells from seed vertices and constructs a linear function for the input data in and around each cell. As the algorithm does not explicitly compute the Voronoi diagram, it avoids the curse of dimensionality. An accuracy of around 98.2% on the MNIST dataset with 722,200 degrees of freedom (without data augmentation, convolution, or other geometric operators) demonstrates the applicability and scalability of the algorithm.






Distributed Multi-robot Source Seeking in Unknown Environments with Unknown Number of Sources

arXiv.org Artificial Intelligence

We introduce a novel distributed source seeking framework, DIAS, designed for multi-robot systems in scenarios where the number of sources is unknown and potentially exceeds the number of robots. Traditional robotic source seeking methods typically focused on directing each robot to a specific strong source and may fall short in comprehensively identifying all potential sources. DIAS addresses this gap by introducing a hybrid controller that identifies the presence of sources and then alternates between exploration for data gathering and exploitation for guiding robots to identified sources. It further enhances search efficiency by dividing the environment into Voronoi cells and approximating source density functions based on Gaussian process regression. Additionally, DIAS can be integrated with existing source seeking algorithms. We compare DIAS with existing algorithms, including DoSS and GMES in simulated gas leakage scenarios where the number of sources outnumbers or is equal to the number of robots. The numerical results show that DIAS outperforms the baseline methods in both the efficiency of source identification by the robots and the accuracy of the estimated environmental density function.


Convergence Rates for Softmax Gating Mixture of Experts

arXiv.org Machine Learning

Mixture of experts (MoE) has recently emerged as an effective framework to advance the efficiency and scalability of machine learning models by softly dividing complex tasks among multiple specialized sub-models termed experts. Central to the success of MoE is an adaptive softmax gating mechanism which takes responsibility for determining the relevance of each expert to a given input and then dynamically assigning experts their respective weights. Despite its widespread use in practice, a comprehensive study on the effects of the softmax gating on the MoE has been lacking in the literature. To bridge this gap in this paper, we perform a convergence analysis of parameter estimation and expert estimation under the MoE equipped with the standard softmax gating or its variants, including a dense-to-sparse gating and a hierarchical softmax gating, respectively. Furthermore, our theories also provide useful insights into the design of sample-efficient expert structures. In particular, we demonstrate that it requires polynomially many data points to estimate experts satisfying our proposed \emph{strong identifiability} condition, namely a commonly used two-layer feed-forward network. In stark contrast, estimating linear experts, which violate the strong identifiability condition, necessitates exponentially many data points as a result of intrinsic parameter interactions expressed in the language of partial differential equations. All the theoretical results are substantiated with a rigorous guarantee.